MLDM Monday x Julia | Julia Tutorial III

Data science

杜岳華

Outline

  • DataFrames.jl
  • Data cleaning
  • Data visualization
    • Gadfly.jl
    • Plots.jl

DataFrame

Install

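When working with tabular data, a DataFrame is the natural choice of data structure.

Compared with a plain Array, it makes access more convenient, supports database-style join operations, and comes with handy aggregate functions.

Install the packages first: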

using Pkg
Pkg.add("DataFrames")
Pkg.add("RDatasets")
In [1]:
using DataFrames

Missing

  • missing represents a missing data point
  • missing is the only instance of the type Missing
  • An operation involving missing returns missing whenever its result cannot be determined

Calculation

In [2]:
true && missing
Out[2]:
missing
In [3]:
typeof(missing)
Out[3]:
Missing
In [4]:
1 + missing
Out[4]:
missing

Calculation

Short-circuit logic

In [5]:
false && missing
Out[5]:
false
In [6]:
true || missing
Out[6]:
true

Checking whether a value is missing

In [7]:
ismissing(missing)
Out[7]:
true
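
As a small sketch of the rules above using only Base Julia: the short-circuit operators `&&` and `||` accept missing only as their last operand (`missing && true` raises a TypeError), while the element-wise operators `&` and `|` apply three-valued logic in either position. The variable names are illustrative.

```julia
# Three-valued logic with the non-short-circuiting operators.
a = true & missing      # unknown: depends on the missing operand
b = false & missing     # false whatever the missing value would be
c = missing | true      # true whatever the missing value would be

# == propagates missing, so use isequal (or ismissing) for a definite answer.
d = missing == missing
e = isequal(missing, missing)

# coalesce returns its first non-missing argument.
f = coalesce(missing, 0)
```

`coalesce` is a convenient way to supply a default value where a missing might appear.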

DataFrame

  • A DataFrame is a tabular data structure
  • All columns must have the same length, and each column is stored as an Array{T, 1}
In [8]:
df = DataFrame(A = 1:4, B = ["M", "F", "F", "M"])
Out[8]:

4 rows × 2 columns

Row  A      B
     Int64  String
1    1      M
2    2      F
3    3      F
4    4      M

Select columns

In [9]:
df.A
Out[9]:
4-element Array{Int64,1}:
 1
 2
 3
 4
In [10]:
df[:A]
Out[10]:
4-element Array{Int64,1}:
 1
 2
 3
 4
In [11]:
df[1]
Out[11]:
4-element Array{Int64,1}:
 1
 2
 3
 4
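
The cells above were run on an older DataFrames.jl. As a hedged sketch, assuming DataFrames.jl 1.x (where `df[:A]` and `df[1]` are no longer accepted), a column is selected like this; the variable names are illustrative.

```julia
using DataFrames

df = DataFrame(A = 1:4, B = ["M", "F", "F", "M"])

col_direct = df.A         # the stored column itself, no copy
col_nocopy = df[!, :A]    # same object as df.A
col_copy   = df[:, :A]    # a fresh copy of the column
col_string = df[!, "A"]   # column names may also be given as strings

col_copy[1] = 100         # modifying the copy leaves df untouched
```

Use `!` when you want to work on the stored column and `:` when you want an independent copy.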

Construct DataFrame by adding new columns

In [12]:
df = DataFrame()  # start with an empty DataFrame, then add the columns
df[:A] = 1:8
df[:B] = ["M", "F", "F", "M", "F", "M", "M", "F"]
df
Out[12]:

8 rows × 2 columns

Row  A      B
     Int64  String
1    1      M
2    2      F
3    3      F
4    4      M
5    5      F
6    6      M
7    7      M
8    8      F

Index

You can index with integers

In [13]:
df[1, 1]
Out[13]:
1

You can also index by column name (a Symbol)

In [14]:
df[1, :A]
Out[14]:
1

Index

In [15]:
df[1:3, :]
Out[15]:

3 rows × 2 columns

Row  A      B
     Int64  String
1    1      M
2    2      F
3    3      F
In [16]:
df[1:3, [:B, :A]]
Out[16]:

3 rows × 2 columns

Row  B       A
     String  Int64
1    M       1
2    F       2
3    F       3

DataFrame information

In [17]:
size(df)  # dimensions as (rows, columns)
Out[17]:
(8, 2)
In [18]:
nrow(df)  # number of rows
Out[18]:
8
In [19]:
ncol(df)  # number of columns
Out[19]:
2

DataFrame information

In [20]:
names(df)  # column names
Out[20]:
2-element Array{Symbol,1}:
 :A
 :B
In [21]:
eltypes(df)
Out[21]:
2-element Array{DataType,1}:
 Int64 
 String

Data selection

In [22]:
df[df[:A] .% 2 .== 0, :]  # rows that satisfy the condition, all columns
Out[22]:

4 rows × 2 columns

Row  A      B
     Int64  String
1    2      F
2    4      M
3    6      M
4    8      F
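
A hedged sketch of the same row selection, assuming DataFrames.jl 1.x: besides a Boolean mask, `filter` and `subset` express the condition per column. The result names are illustrative.

```julia
using DataFrames

df = DataFrame(A = 1:8, B = ["M", "F", "F", "M", "F", "M", "M", "F"])

# Boolean mask built by broadcasting over the column.
evens_mask = df[iseven.(df.A), :]

# filter takes `column => predicate`; the predicate sees one value at a time.
evens_filter = filter(:A => iseven, df)

# subset applies a function to whole columns, so wrap the predicate in ByRow.
evens_subset = subset(df, :A => ByRow(iseven))
```

All three return a new DataFrame and leave `df` unchanged.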

Constructing row by row

In [23]:
df = DataFrame(A = Int[], B = String[])
Out[23]:

0 rows × 2 columns

Row  A      B
     Int64  String
In [24]:
push!(df, (1, "M"))
Out[24]:

1 rows × 2 columns

Row  A      B
     Int64  String
1    1      M
In [25]:
push!(df, [2, "N"])
Out[25]:

2 rows × 2 columns

Row  A      B
     Int64  String
1    1      M
2    2      N

Constructing row by row

In [26]:
push!(df, Dict(:B => "F", :A => 3))
Out[26]:

3 rows × 2 columns

Row  A      B
     Int64  String
1    1      M
2    2      N
3    3      F

Load

In [27]:
using CSV
In [28]:
file = "data.csv"
df = CSV.read(file)
Out[28]:

4 rows × 3 columns

Row  name     age     squidPerWeek
     String⍰  Int64⍰  Float64⍰
1    Alice    36      3.14
2    Bob      24      0.0
3    Carol    58      2.71
4    Eve      49      7.77
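
A hedged sketch, assuming a recent CSV.jl in which `CSV.read` takes the target table type as its second argument. It writes a small made-up table to a temporary file and reads it back; the file name and data are illustrative.

```julia
using CSV, DataFrames

# A small illustrative table (not the data.csv used above).
people = DataFrame(name = ["Alice", "Bob"], age = [36, 24])

# Write to a temporary tab-separated file.
path = joinpath(mktempdir(), "people.tsv")
CSV.write(path, people; delim = '\t')

# Read it back, asking CSV.jl to materialize a DataFrame.
loaded = CSV.read(path, DataFrame; delim = '\t')
```

Passing `DataFrame` as the sink makes the result type explicit.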

Save

In [29]:
CSV.write("data.tsv", df, delim='\t')
Out[29]:
"data.tsv"
In [30]:
df |> CSV.write("data.tsv", delim='\t')
Out[30]:
"data.tsv"

Categorical data

In [31]:
using CategoricalArrays
In [32]:
df = DataFrame()
df[:A] = 1:8
df[:B] = ["M", "F", "F", "M", "F", "M", "M", "F"]
Out[32]:
8-element Array{String,1}:
 "M"
 "F"
 "F"
 "M"
 "F"
 "M"
 "M"
 "F"

Categorical array

In [33]:
ca = CategoricalArray(df[:B])
Out[33]:
8-element CategoricalArray{String,1,UInt32}:
 "M"
 "F"
 "F"
 "M"
 "F"
 "M"
 "M"
 "F"
In [34]:
levels(ca)
Out[34]:
2-element Array{String,1}:
 "F"
 "M"

Categorical array

Applied to a DataFrame column

In [35]:
df[:B]
Out[35]:
8-element Array{String,1}:
 "M"
 "F"
 "F"
 "M"
 "F"
 "M"
 "M"
 "F"
In [36]:
levels(df[:B])
Out[36]:
2-element Array{String,1}:
 "F"
 "M"

Transform to categorical array

In [37]:
categorical!(df, :B)
Out[37]:

8 rows × 2 columns

Row  A      B
     Int64  Categorical…
1    1      M
2    2      F
3    3      F
4    4      M
5    5      F
6    6      M
7    7      M
8    8      F
In [38]:
eltypes(df)
Out[38]:
2-element Array{DataType,1}:
 Int64                    
 CategoricalString{UInt32}

Transform to categorical array

In [39]:
categorical!(df)
Out[39]:

8 rows × 2 columns

Row  A      B
     Int64  Categorical…
1    1      M
2    2      F
3    3      F
4    4      M
5    5      F
6    6      M
7    7      M
8    8      F
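
A hedged sketch of the same conversion, assuming DataFrames.jl 1.x with a recent CategoricalArrays.jl, where `categorical!` is no longer available and the column is replaced explicitly. The names `lv` and `codes` are illustrative.

```julia
using DataFrames, CategoricalArrays

df = DataFrame(A = 1:8, B = ["M", "F", "F", "M", "F", "M", "M", "F"])

# Replace the String column with a categorical version of itself.
df.B = categorical(df.B)

lv = levels(df.B)            # the distinct levels, sorted by default
codes = levelcode.(df.B)     # integer position of each value among the levels
```

The integer codes are what makes grouping and sorting on a categorical column cheap.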

Practice!

Now let's play with some real data using a DataFrame!

In [40]:
using RDatasets

Pick the dataset you want

In [41]:
RDatasets.packages()[1:10, :]
warning: failed parsing String on row=8, col=2, error=INVALID: OK, QUOTED, NEWLINE, INVALID_DELIMITER
Out[41]:

10 rows × 2 columns

Row  Package       Title
     String⍰       String⍰
1    COUNT         Functions, data and code for count data.
2    Ecdat         Data sets for econometrics
3    HSAUR         A Handbook of Statistical Analyses Using R (1st Edition)
4    HistData      Data sets from the history of statistics and data visualization
5    ISLR          Data for An Introduction to Statistical Learning with Applications in R
6    KMsurv        Data sets from Klein and Moeschberger (1997), Survival Analysis
7    MASS          Support Functions and Datasets for Venables and Ripley's MASS
8    SASmixed      Data sets from \\
9    Zelig         Everyone's Statistical Software
10   adehabitatLT  Analysis of Animal Movements

Here we pick the iris dataset

In [42]:
iris = dataset("datasets", "iris")
first(iris, 10)
Out[42]:

10 rows × 5 columns

Row  SepalLength  SepalWidth  PetalLength  PetalWidth  Species
     Float64      Float64     Float64      Float64     Categorical…
1    5.1          3.5         1.4          0.2         setosa
2    4.9          3.0         1.4          0.2         setosa
3    4.7          3.2         1.3          0.2         setosa
4    4.6          3.1         1.5          0.2         setosa
5    5.0          3.6         1.4          0.2         setosa
6    5.4          3.9         1.7          0.4         setosa
7    4.6          3.4         1.4          0.3         setosa
8    5.0          3.4         1.5          0.2         setosa
9    4.4          2.9         1.4          0.2         setosa
10   4.9          3.1         1.5          0.1         setosa

Check how many rows and columns it has

In [43]:
size(iris)
Out[43]:
(150, 5)

Concatenate two identical DataFrames vertically

In [44]:
size(vcat(iris, iris))
Out[44]:
(300, 5)

To concatenate two DataFrames horizontally, use hcat (pass makeunique=true when column names collide)

Dealing with missing data

In [45]:
messy_data = [1, 2, 3, missing, 4, 5, missing]
Out[45]:
7-element Array{Union{Missing, Int64},1}:
 1       
 2       
 3       
  missing
 4       
 5       
  missing
In [46]:
sum(messy_data)
Out[46]:
missing

Skip missing data

In [47]:
skipmissing(messy_data)
Out[47]:
Base.SkipMissing{Array{Union{Missing, Int64},1}}(Union{Missing, Int64}[1, 2, 3, missing, 4, 5, missing])
In [48]:
sum(skipmissing(messy_data))
Out[48]:
15

Impute missing data

Replace missing with 10

In [49]:
map(x -> ismissing(x) ? 10 : x, messy_data)
Out[49]:
7-element Array{Int64,1}:
  1
  2
  3
 10
  4
  5
 10
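
A short Base Julia sketch of two other ways to write the same replacement, plus a mean-based fill. The mean fill is only an illustration of the idea, not something the slides prescribe, and the variable names are illustrative.

```julia
using Statistics

messy_data = [1, 2, 3, missing, 4, 5, missing]

# coalesce returns its first non-missing argument; broadcast it over the vector.
filled_const = coalesce.(messy_data, 10)

# replace with a `missing => value` pair does the same job.
filled_replace = replace(messy_data, missing => 10)

# Fill with the mean of the observed values instead of a constant.
observed_mean = mean(skipmissing(messy_data))
filled_mean = coalesce.(messy_data, observed_mean)
```

Which fill value is appropriate depends on the data; a constant such as 10 is only a placeholder here.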

Check whether each row contains a missing value

true means the row contains no missing values

In [50]:
completecases(iris)
Out[50]:
150-element BitArray{1}:
 true
 true
 true
 true
 true
 true
 true
 true
 true
 true
 true
 true
 true
    ⋮
 true
 true
 true
 true
 true
 true
 true
 true
 true
 true
 true
 true

Select the complete rows

completecases!(iris) does the same thing

In [51]:
iris[completecases(iris), :]
Out[51]:

150 rows × 5 columns

Row  SepalLength  SepalWidth  PetalLength  PetalWidth  Species
     Float64      Float64     Float64      Float64     Categorical…
1    5.1          3.5         1.4          0.2         setosa
2    4.9          3.0         1.4          0.2         setosa
3    4.7          3.2         1.3          0.2         setosa
4    4.6          3.1         1.5          0.2         setosa
5    5.0          3.6         1.4          0.2         setosa
6    5.4          3.9         1.7          0.4         setosa
7    4.6          3.4         1.4          0.3         setosa
8    5.0          3.4         1.5          0.2         setosa
9    4.4          2.9         1.4          0.2         setosa
10   4.9          3.1         1.5          0.1         setosa
11   5.4          3.7         1.5          0.2         setosa
12   4.8          3.4         1.6          0.2         setosa
13   4.8          3.0         1.4          0.1         setosa
14   4.3          3.0         1.1          0.1         setosa
15   5.8          4.0         1.2          0.2         setosa
16   5.7          4.4         1.5          0.4         setosa
17   5.4          3.9         1.3          0.4         setosa
18   5.1          3.5         1.4          0.3         setosa
19   5.7          3.8         1.7          0.3         setosa
20   5.1          3.8         1.5          0.3         setosa
21   5.4          3.4         1.7          0.2         setosa
22   5.1          3.7         1.5          0.4         setosa
23   4.6          3.6         1.0          0.2         setosa
24   5.1          3.3         1.7          0.5         setosa
25   4.8          3.4         1.9          0.2         setosa
26   5.0          3.0         1.6          0.2         setosa
27   5.0          3.4         1.6          0.4         setosa
28   5.2          3.5         1.5          0.2         setosa
29   5.2          3.4         1.4          0.2         setosa
30   4.7          3.2         1.6          0.2         setosa

Or keep only the unique rows

In [52]:
unique(iris)
Out[52]:

149 rows × 5 columns

Row  SepalLength  SepalWidth  PetalLength  PetalWidth  Species
     Float64      Float64     Float64      Float64     Categorical…
1    5.1          3.5         1.4          0.2         setosa
2    4.9          3.0         1.4          0.2         setosa
3    4.7          3.2         1.3          0.2         setosa
4    4.6          3.1         1.5          0.2         setosa
5    5.0          3.6         1.4          0.2         setosa
6    5.4          3.9         1.7          0.4         setosa
7    4.6          3.4         1.4          0.3         setosa
8    5.0          3.4         1.5          0.2         setosa
9    4.4          2.9         1.4          0.2         setosa
10   4.9          3.1         1.5          0.1         setosa
11   5.4          3.7         1.5          0.2         setosa
12   4.8          3.4         1.6          0.2         setosa
13   4.8          3.0         1.4          0.1         setosa
14   4.3          3.0         1.1          0.1         setosa
15   5.8          4.0         1.2          0.2         setosa
16   5.7          4.4         1.5          0.4         setosa
17   5.4          3.9         1.3          0.4         setosa
18   5.1          3.5         1.4          0.3         setosa
19   5.7          3.8         1.7          0.3         setosa
20   5.1          3.8         1.5          0.3         setosa
21   5.4          3.4         1.7          0.2         setosa
22   5.1          3.7         1.5          0.4         setosa
23   4.6          3.6         1.0          0.2         setosa
24   5.1          3.3         1.7          0.5         setosa
25   4.8          3.4         1.9          0.2         setosa
26   5.0          3.0         1.6          0.2         setosa
27   5.0          3.4         1.6          0.4         setosa
28   5.2          3.5         1.5          0.2         setosa
29   5.2          3.4         1.4          0.2         setosa
30   4.7          3.2         1.6          0.2         setosa

Compute a new column from existing ones

In [53]:
iris[:PetalArea] = iris[:PetalLength] .* iris[:PetalWidth]
first(iris, 10)
Out[53]:

10 rows × 6 columns

Row  SepalLength  SepalWidth  PetalLength  PetalWidth  Species       PetalArea
     Float64      Float64     Float64      Float64     Categorical…  Float64
1    5.1          3.5         1.4          0.2         setosa        0.28
2    4.9          3.0         1.4          0.2         setosa        0.28
3    4.7          3.2         1.3          0.2         setosa        0.26
4    4.6          3.1         1.5          0.2         setosa        0.3
5    5.0          3.6         1.4          0.2         setosa        0.28
6    5.4          3.9         1.7          0.4         setosa        0.68
7    4.6          3.4         1.4          0.3         setosa        0.42
8    5.0          3.4         1.5          0.2         setosa        0.3
9    4.4          2.9         1.4          0.2         setosa        0.28
10   4.9          3.1         1.5          0.1         setosa        0.15

Sorting

In [54]:
sort!(iris, :SepalLength)
Out[54]:

150 rows × 6 columns

Row  SepalLength  SepalWidth  PetalLength  PetalWidth  Species       PetalArea
     Float64      Float64     Float64      Float64     Categorical…  Float64
1    4.3          3.0         1.1          0.1         setosa        0.11
2    4.4          2.9         1.4          0.2         setosa        0.28
3    4.4          3.0         1.3          0.2         setosa        0.26
4    4.4          3.2         1.3          0.2         setosa        0.26
5    4.5          2.3         1.3          0.3         setosa        0.39
6    4.6          3.1         1.5          0.2         setosa        0.3
7    4.6          3.4         1.4          0.3         setosa        0.42
8    4.6          3.6         1.0          0.2         setosa        0.2
9    4.6          3.2         1.4          0.2         setosa        0.28
10   4.7          3.2         1.3          0.2         setosa        0.26
11   4.7          3.2         1.6          0.2         setosa        0.32
12   4.8          3.4         1.6          0.2         setosa        0.32
13   4.8          3.0         1.4          0.1         setosa        0.14
14   4.8          3.4         1.9          0.2         setosa        0.38
15   4.8          3.1         1.6          0.2         setosa        0.32
16   4.8          3.0         1.4          0.3         setosa        0.42
17   4.9          3.0         1.4          0.2         setosa        0.28
18   4.9          3.1         1.5          0.1         setosa        0.15
19   4.9          3.1         1.5          0.2         setosa        0.3
20   4.9          3.6         1.4          0.1         setosa        0.14
21   4.9          2.4         3.3          1.0         versicolor    3.3
22   4.9          2.5         4.5          1.7         virginica     7.65
23   5.0          3.6         1.4          0.2         setosa        0.28
24   5.0          3.4         1.5          0.2         setosa        0.3
25   5.0          3.0         1.6          0.2         setosa        0.32
26   5.0          3.4         1.6          0.4         setosa        0.64
27   5.0          3.2         1.2          0.2         setosa        0.24
28   5.0          3.5         1.3          0.3         setosa        0.39
29   5.0          3.5         1.6          0.6         setosa        0.96
30   5.0          3.3         1.4          0.2         setosa        0.28

Sort by several columns, specifying for each whether the order is reversed

In [55]:
sort!(iris, [:Species, :SepalLength, :SepalWidth], rev=(true, false, false))
Out[55]:

150 rows × 6 columns

Row  SepalLength  SepalWidth  PetalLength  PetalWidth  Species       PetalArea
     Float64      Float64     Float64      Float64     Categorical…  Float64
1    4.9          2.5         4.5          1.7         virginica     7.65
2    5.6          2.8         4.9          2.0         virginica     9.8
3    5.7          2.5         5.0          2.0         virginica     10.0
4    5.8          2.7         5.1          1.9         virginica     9.69
5    5.8          2.7         5.1          1.9         virginica     9.69
6    5.8          2.8         5.1          2.4         virginica     12.24
7    5.9          3.0         5.1          1.8         virginica     9.18
8    6.0          2.2         5.0          1.5         virginica     7.5
9    6.0          3.0         4.8          1.8         virginica     8.64
10   6.1          2.6         5.6          1.4         virginica     7.84
11   6.1          3.0         4.9          1.8         virginica     8.82
12   6.2          2.8         4.8          1.8         virginica     8.64
13   6.2          3.4         5.4          2.3         virginica     12.42
14   6.3          2.5         5.0          1.9         virginica     9.5
15   6.3          2.7         4.9          1.8         virginica     8.82
16   6.3          2.8         5.1          1.5         virginica     7.65
17   6.3          2.9         5.6          1.8         virginica     10.08
18   6.3          3.3         6.0          2.5         virginica     15.0
19   6.3          3.4         5.6          2.4         virginica     13.44
20   6.4          2.7         5.3          1.9         virginica     10.07
21   6.4          2.8         5.6          2.1         virginica     11.76
22   6.4          2.8         5.6          2.2         virginica     12.32
23   6.4          3.1         5.5          1.8         virginica     9.9
24   6.4          3.2         5.3          2.3         virginica     12.19
25   6.5          3.0         5.8          2.2         virginica     12.76
26   6.5          3.0         5.5          1.8         virginica     9.9
27   6.5          3.0         5.2          2.0         virginica     10.4
28   6.5          3.2         5.1          2.0         virginica     10.2
29   6.7          2.5         5.8          1.8         virginica     10.44
30   6.7          3.0         5.2          2.3         virginica     11.96

Join operations

In [56]:
name = DataFrame(ID=[1, 2, 3], Name=["John Doe", "Jane Doe", "Andy Doe"])
Out[56]:

3 rows × 2 columns

Row  ID     Name
     Int64  String
1    1      John Doe
2    2      Jane Doe
3    3      Andy Doe
In [57]:
job = DataFrame(ID=[1, 2, 4], Job=["Lawyer", "Doctor", "Chief"])
Out[57]:

3 rows × 2 columns

Row  ID     Job
     Int64  String
1    1      Lawyer
2    2      Doctor
3    4      Chief

An inner join outputs the rows whose key appears in both the left and the right table

In [58]:
full = join(name, job, on=:ID)  # join defaults to an inner join
Out[58]:

2 rows × 3 columns

Row  ID     Name      Job
     Int64  String    String
1    1      John Doe  Lawyer
2    2      Jane Doe  Doctor

A left join keeps every row of the left table

In [59]:
left_join = join(name, job, on=:ID, kind=:left)
Out[59]:

3 rows × 3 columns

Row  ID     Name      Job
     Int64  String    String⍰
1    1      John Doe  Lawyer
2    2      Jane Doe  Doctor
3    3      Andy Doe  missing

A right join keeps every row of the right table

In [60]:
right_join = join(name, job, on=:ID, kind=:right)
Out[60]:

3 rows × 3 columns

Row  ID     Name      Job
     Int64  String⍰   String
1    1      John Doe  Lawyer
2    2      Jane Doe  Doctor
3    4      missing   Chief

An outer join outputs the rows that appear in either the left or the right table

In [61]:
outer_join = join(name, job, on=:ID, kind=:outer)
Out[61]:

4 rows × 3 columns

Row  ID     Name      Job
     Int64  String⍰   String⍰
1    1      John Doe  Lawyer
2    2      Jane Doe  Doctor
3    3      Andy Doe  missing
4    4      missing   Chief

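Other joins:

  • Semi join: like an inner join, but the output contains only the columns of the left table
  • Anti join: outputs the rows whose key exists in the left table but not in the right table
  • Cross join: outputs every combination of a left-table row with a right-table row (Cartesian product)
In [62]:
cross_join = join(name, job, kind=:cross, makeunique=true)  # no key needed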
Out[62]:

9 rows × 4 columns

Row  ID     Name      ID_1   Job
     Int64  String    Int64  String
1    1      John Doe  1      Lawyer
2    1      John Doe  2      Doctor
3    1      John Doe  4      Chief
4    2      Jane Doe  1      Lawyer
5    2      Jane Doe  2      Doctor
6    2      Jane Doe  4      Chief
7    3      Andy Doe  1      Lawyer
8    3      Andy Doe  2      Doctor
9    3      Andy Doe  4      Chief
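
A hedged sketch of the same joins, assuming DataFrames.jl 1.x, where the single `join(...; kind=...)` call used above is replaced by one function per join kind. Row order after a join is not relied on here.

```julia
using DataFrames

name = DataFrame(ID = [1, 2, 3], Name = ["John Doe", "Jane Doe", "Andy Doe"])
job  = DataFrame(ID = [1, 2, 4], Job = ["Lawyer", "Doctor", "Chief"])

inner = innerjoin(name, job, on = :ID)   # keys present in both tables
left  = leftjoin(name, job, on = :ID)    # every row of `name`
right = rightjoin(name, job, on = :ID)   # every row of `job`
outer = outerjoin(name, job, on = :ID)   # keys present in either table
semi  = semijoin(name, job, on = :ID)    # rows of `name` with a match, left columns only
anti  = antijoin(name, job, on = :ID)    # rows of `name` without a match
cross = crossjoin(name, job, makeunique = true)  # Cartesian product, no key
```

Unmatched cells in the left, right and outer joins are filled with missing, as in the tables above.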

The Split-Apply-Combine Strategy

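Besides database-style joins, there is also a strategy for computing on the data:

  • Split: divide the data into groups
  • Apply: run a function on each group
  • Combine: merge the results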
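
The three steps can be sketched in plain Julia, without any package, to show what the DataFrame functions do behind the scenes. The numbers below are made-up toy values, not rows from iris, and all names are illustrative.

```julia
# Toy data: one label and one measurement per observation.
species = ["setosa", "setosa", "versicolor", "versicolor", "virginica"]
petal_length = [1.0, 2.0, 4.0, 5.0, 6.0]

# Split: collect the measurements of each species into its own vector.
groups = Dict{String, Vector{Float64}}()
for (s, x) in zip(species, petal_length)
    push!(get!(groups, s, Float64[]), x)
end

# Apply: compute the mean of every group.
group_means = Dict(s => sum(xs) / length(xs) for (s, xs) in groups)

# Combine: gather the per-group results into one sorted list of pairs.
result = sort(collect(group_means), by = first)
```

The DataFrame functions shown next perform these steps in a single call.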

By

In [63]:
by(iris, :Species, size)
Out[63]:

3 rows × 2 columns

Row  Species       x1
     Categorical…  Tuple…
1    setosa        (50, 6)
2    versicolor    (50, 6)
3    virginica     (50, 6)

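For each value of :Species, compute size() on the corresponding subset of iris

Arguments:

  1. The DataFrame
  2. The column used to split the DataFrame
  3. A function or expression to apply to each group

The do syntax is also supported: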

In [64]:
using StatsBase
WARNING: using StatsBase.df in module Main conflicts with an existing identifier.

This is convenient when the function is more complex

In [65]:
by(iris, :Species) do df
    DataFrame(m = StatsBase.mean(df[:, :PetalLength]), σ² = StatsBase.var(df[:, :PetalLength]))
end
Out[65]:

3 rows × 3 columns

Row  Species       m        σ²
     Categorical…  Float64  Float64
1    setosa        1.462    0.0301592
2    versicolor    4.26     0.220816
3    virginica     5.552    0.304588
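
A hedged sketch of the same per-group summary, assuming DataFrames.jl 1.x, where `by` is written as `combine(groupby(...), ...)`. The table is made-up toy data rather than iris, and the output column names `m`, `s2` and `n` are illustrative.

```julia
using DataFrames, Statistics

toy = DataFrame(Species = ["a", "a", "b", "b"],
                PetalLength = [1.0, 2.0, 4.0, 6.0])

# groupby splits; each `source => function => target` pair is applied per group;
# combine stacks the per-group results into one DataFrame.
stats = combine(groupby(toy, :Species, sort = true),
                :PetalLength => mean => :m,
                :PetalLength => var => :s2,
                nrow => :n)
```

`sort = true` is passed so that the order of the groups in the result is well defined.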

Aggregate

In [66]:
aggregate(iris, :Species, [sum, StatsBase.mean])
Out[66]:

3 rows × 11 columns

Row  Species       SepalLength_sum  SepalWidth_sum  PetalLength_sum  PetalWidth_sum  PetalArea_sum  SepalLength_mean  SepalWidth_mean  PetalLength_mean  PetalWidth_mean  PetalArea_mean
     Categorical…  Float64          Float64         Float64          Float64         Float64        Float64           Float64          Float64           Float64          Float64
1    setosa        250.3            171.4           73.1             12.3            18.28          5.006             3.428            1.462             0.246            0.3656
2    versicolor    296.8            138.5           213.0            66.3            286.02         5.936             2.77             4.26              1.326            5.7204
3    virginica     329.4            148.7           277.6            101.3           564.81         6.588             2.974            5.552             2.026            11.2962

Group by :Species, then compute the sum and the mean

Arguments:

  1. The DataFrame
  2. One or more columns used to split the DataFrame
  3. One or more functions or expressions to apply to each group

Groupby

Just split the data

In [67]:
for subdf in groupby(iris, :Species)
    println(size(subdf, 1))
end
50
50
50

Reshaping tables

The "wide" table format

In [68]:
iris[:id] = 1:size(iris, 1)
iris
Out[68]:

150 rows × 7 columns

Row  SepalLength  SepalWidth  PetalLength  PetalWidth  Species       PetalArea  id
     Float64      Float64     Float64      Float64     Categorical…  Float64    Int64
1    4.9          2.5         4.5          1.7         virginica     7.65       1
2    5.6          2.8         4.9          2.0         virginica     9.8        2
3    5.7          2.5         5.0          2.0         virginica     10.0       3
4    5.8          2.7         5.1          1.9         virginica     9.69       4
5    5.8          2.7         5.1          1.9         virginica     9.69       5
6    5.8          2.8         5.1          2.4         virginica     12.24      6
7    5.9          3.0         5.1          1.8         virginica     9.18       7
8    6.0          2.2         5.0          1.5         virginica     7.5        8
9    6.0          3.0         4.8          1.8         virginica     8.64       9
10   6.1          2.6         5.6          1.4         virginica     7.84       10
11   6.1          3.0         4.9          1.8         virginica     8.82       11
12   6.2          2.8         4.8          1.8         virginica     8.64       12
13   6.2          3.4         5.4          2.3         virginica     12.42      13
14   6.3          2.5         5.0          1.9         virginica     9.5        14
15   6.3          2.7         4.9          1.8         virginica     8.82       15
16   6.3          2.8         5.1          1.5         virginica     7.65       16
17   6.3          2.9         5.6          1.8         virginica     10.08      17
18   6.3          3.3         6.0          2.5         virginica     15.0       18
19   6.3          3.4         5.6          2.4         virginica     13.44      19
20   6.4          2.7         5.3          1.9         virginica     10.07      20
21   6.4          2.8         5.6          2.1         virginica     11.76      21
22   6.4          2.8         5.6          2.2         virginica     12.32      22
23   6.4          3.1         5.5          1.8         virginica     9.9        23
24   6.4          3.2         5.3          2.3         virginica     12.19      24
25   6.5          3.0         5.8          2.2         virginica     12.76      25
26   6.5          3.0         5.5          1.8         virginica     9.9        26
27   6.5          3.0         5.2          2.0         virginica     10.4       27
28   6.5          3.2         5.1          2.0         virginica     10.2       28
29   6.7          2.5         5.8          1.8         virginica     10.44      29
30   6.7          3.0         5.2          2.3         virginica     11.96      30

Stack the specified columns into the rows to obtain a "long" table

In [69]:
d = stack(iris, [:SepalLength, :SepalWidth, :PetalLength, :PetalWidth])
Out[69]:

600 rows × 5 columns

Row  variable     value    Species       PetalArea  id
     Symbol       Float64  Categorical…  Float64    Int64
1    SepalLength  4.9      virginica     7.65       1
2    SepalLength  5.6      virginica     9.8        2
3    SepalLength  5.7      virginica     10.0       3
4    SepalLength  5.8      virginica     9.69       4
5    SepalLength  5.8      virginica     9.69       5
6    SepalLength  5.8      virginica     12.24      6
7    SepalLength  5.9      virginica     9.18       7
8    SepalLength  6.0      virginica     7.5        8
9    SepalLength  6.0      virginica     8.64       9
10   SepalLength  6.1      virginica     7.84       10
11   SepalLength  6.1      virginica     8.82       11
12   SepalLength  6.2      virginica     8.64       12
13   SepalLength  6.2      virginica     12.42      13
14   SepalLength  6.3      virginica     9.5        14
15   SepalLength  6.3      virginica     8.82       15
16   SepalLength  6.3      virginica     7.65       16
17   SepalLength  6.3      virginica     10.08      17
18   SepalLength  6.3      virginica     15.0       18
19   SepalLength  6.3      virginica     13.44      19
20   SepalLength  6.4      virginica     10.07      20
21   SepalLength  6.4      virginica     11.76      21
22   SepalLength  6.4      virginica     12.32      22
23   SepalLength  6.4      virginica     9.9        23
24   SepalLength  6.4      virginica     12.19      24
25   SepalLength  6.5      virginica     12.76      25
26   SepalLength  6.5      virginica     9.9        26
27   SepalLength  6.5      virginica     10.4       27
28   SepalLength  6.5      virginica     10.2       28
29   SepalLength  6.7      virginica     10.44      29
30   SepalLength  6.7      virginica     11.96      30

Stack the specified columns and choose which other columns to keep

In [70]:
d = stack(iris, [:SepalLength, :SepalWidth], :Species)
Out[70]:

300 rows × 3 columns

Row  variable     value    Species
     Symbol       Float64  Categorical…
1    SepalLength  4.9      virginica
2    SepalLength  5.6      virginica
3    SepalLength  5.7      virginica
4    SepalLength  5.8      virginica
5    SepalLength  5.8      virginica
6    SepalLength  5.8      virginica
7    SepalLength  5.9      virginica
8    SepalLength  6.0      virginica
9    SepalLength  6.0      virginica
10   SepalLength  6.1      virginica
11   SepalLength  6.1      virginica
12   SepalLength  6.2      virginica
13   SepalLength  6.2      virginica
14   SepalLength  6.3      virginica
15   SepalLength  6.3      virginica
16   SepalLength  6.3      virginica
17   SepalLength  6.3      virginica
18   SepalLength  6.3      virginica
19   SepalLength  6.3      virginica
20   SepalLength  6.4      virginica
21   SepalLength  6.4      virginica
22   SepalLength  6.4      virginica
23   SepalLength  6.4      virginica
24   SepalLength  6.4      virginica
25   SepalLength  6.5      virginica
26   SepalLength  6.5      virginica
27   SepalLength  6.5      virginica
28   SepalLength  6.5      virginica
29   SepalLength  6.7      virginica
30   SepalLength  6.7      virginica

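Convert a long table back to a wide one

The 2nd argument is the column that identifies each row; the 3rd and 4th arguments are the columns holding the variable names and the values produced by stack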

In [71]:
d = stack(iris, [:SepalLength, :SepalWidth, :PetalLength, :PetalWidth])
unstack(d, :id, :variable, :value)
Out[71]:

150 rows × 5 columns

Row  id     PetalLength  PetalWidth  SepalLength  SepalWidth
     Int64  Float64⍰     Float64⍰    Float64⍰     Float64⍰
1    1      4.5          1.7         4.9          2.5
2    2      4.9          2.0         5.6          2.8
3    3      5.0          2.0         5.7          2.5
4    4      5.1          1.9         5.8          2.7
5    5      5.1          1.9         5.8          2.7
6    6      5.1          2.4         5.8          2.8
7    7      5.1          1.8         5.9          3.0
8    8      5.0          1.5         6.0          2.2
9    9      4.8          1.8         6.0          3.0
10   10     5.6          1.4         6.1          2.6
11   11     4.9          1.8         6.1          3.0
12   12     4.8          1.8         6.2          2.8
13   13     5.4          2.3         6.2          3.4
14   14     5.0          1.9         6.3          2.5
15   15     4.9          1.8         6.3          2.7
16   16     5.1          1.5         6.3          2.8
17   17     5.6          1.8         6.3          2.9
18   18     6.0          2.5         6.3          3.3
19   19     5.6          2.4         6.3          3.4
20   20     5.3          1.9         6.4          2.7
21   21     5.6          2.1         6.4          2.8
22   22     5.6          2.2         6.4          2.8
23   23     5.5          1.8         6.4          3.1
24   24     5.3          2.3         6.4          3.2
25   25     5.8          2.2         6.5          3.0
26   26     5.5          1.8         6.5          3.0
27   27     5.2          2.0         6.5          3.0
28   28     5.1          2.0         6.5          3.2
29   29     5.8          1.8         6.7          2.5
30   30     5.2          2.3         6.7          3.0
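
A hedged sketch of the wide-to-long-to-wide round trip on a tiny made-up table, assuming DataFrames.jl 1.x (where the `variable` column holds strings rather than Symbols). The table and its values are illustrative.

```julia
using DataFrames

wide = DataFrame(id = 1:2,
                 SepalLength = [5.1, 4.9],
                 SepalWidth = [3.5, 3.0])

# stack moves the listed columns into (variable, value) pairs, one row per pair.
long = stack(wide, [:SepalLength, :SepalWidth])

# unstack spreads them out again: row key, column of names, column of values.
wide_again = unstack(long, :id, :variable, :value)
```

Each of the 2 rows contributes one long row per stacked column, so `long` has 4 rows.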

If the remaining columns uniquely identify each row, the identifier column can be left out

In [72]:
unstack(d, :variable, :value)
Out[72]:

150 rows × 7 columns

Row  Species       PetalArea  id     PetalLength  PetalWidth  SepalLength  SepalWidth
     Categorical…  Float64    Int64  Float64⍰     Float64⍰    Float64⍰     Float64⍰
1    virginica     7.5        8      5.0          1.5         6.0          2.2
2    virginica     7.65       1      4.5          1.7         4.9          2.5
3    virginica     7.65       16     5.1          1.5         6.3          2.8
4    virginica     7.84       10     5.6          1.4         6.1          2.6
5    virginica     8.64       9      4.8          1.8         6.0          3.0
6    virginica     8.64       12     4.8          1.8         6.2          2.8
7    virginica     8.82       11     4.9          1.8         6.1          3.0
8    virginica     8.82       15     4.9          1.8         6.3          2.7
9    virginica     9.18       7      5.1          1.8         5.9          3.0
10   virginica     9.28       40     5.8          1.6         7.2          3.0
11   virginica     9.5        14     5.0          1.9         6.3          2.5
12   virginica     9.69       4      5.1          1.9         5.8          2.7
13   virginica     9.69       5      5.1          1.9         5.8          2.7
14   virginica     9.8        2      4.9          2.0         5.6          2.8
15   virginica     9.9        23     5.5          1.8         6.4          3.1
16   virginica     9.9        26     5.5          1.8         6.5          3.0
17   virginica     10.0       3      5.0          2.0         5.7          2.5
18   virginica     10.07      20     5.3          1.9         6.4          2.7
19   virginica     10.08      17     5.6          1.8         6.3          2.9
20   virginica     10.2       28     5.1          2.0         6.5          3.2
21   virginica     10.4       27     5.2          2.0         6.5          3.0
22   virginica     10.44      29     5.8          1.8         6.7          2.5
23   virginica     10.8       41     6.0          1.8         7.2          3.2
24   virginica     11.34      43     6.3          1.8         7.3          2.9
25   virginica     11.34      36     5.4          2.1         6.9          3.1
26   virginica     11.55      34     5.5          2.1         6.8          3.0
27   virginica     11.59      44     6.1          1.9         7.4          2.8
28   virginica     11.73      37     5.1          2.3         6.9          3.1
29   virginica     11.76      21     5.6          2.1         6.4          2.8
30   virginica     11.96      30     5.2          2.3         6.7          3.0

By combining what we have covered so far, the data can be reshaped and computed on freely

For each :Species, compute the mean of every feature

In [73]:
d = stack(iris)
x = by(d, [:variable, :Species], df -> DataFrame(vmean = StatsBase.mean(df[:, :value])))
unstack(x, :Species, :vmean)
Out[73]:

5 rows × 4 columns

Row  variable     virginica  versicolor  setosa
     Symbol       Float64⍰   Float64⍰    Float64⍰
1    PetalArea    11.2962    5.7204      0.3656
2    PetalLength  5.552      4.26        1.462
3    PetalWidth   2.026      1.326       0.246
4    SepalLength  6.588      5.936       5.006
5    SepalWidth   2.974      2.77        3.428

Gadfly

We use the Gadfly.jl module here, so install it now if you have not done so yet

In [74]:
using Gadfly
┌ Info: Loading DataFrames support into Gadfly.jl
└ @ Gadfly /home/pika/.julia/packages/Gadfly/ew1SM/src/mapping.jl:228

Tastes

In [75]:
p = plot(iris, x=:SepalLength, y=:SepalWidth, Geom.point)
┌ Warning: For svg transparent colors, use either e.g. fill(RGBA(r,g,b,a)) or fillopacity(a), but not both.
└ @ Compose /home/pika/.julia/packages/Compose/wlPCt/src/svg.jl:1271
Out[75]:
[Figure: scatter plot of SepalWidth (y axis) against SepalLength (x axis)]
In [76]:
img = SVG("iris_plot.svg", 14cm, 8cm)
draw(img, p)
Out[76]:
false

With lines

In [77]:
plot(iris, x=:SepalLength, y=:SepalWidth, Geom.point, Geom.line)
Out[77]:
[Figure: SepalWidth (y axis) against SepalLength (x axis), drawn with points and connecting lines]

Plotting arrays

We start with the simplest case, an Array: give plot the X and Y coordinates and it draws the figure for you

In [78]:
plot(x=randn(20), y=randn(20))
Out[78]:
[Figure: scatter plot of y against x for 20 random points]

Aesthetics

In [79]:
plot(x=rand(10), y=rand(10), Geom.point, Geom.line)
Out[79]:
[Figure: y against x for 10 random points, drawn with points and connecting lines]

Next we add lines on top to connect the points, which takes two arguments:

  • Geom.point: draws the points
  • Geom.line: draws the lines, giving a line chart

Scale and guide

In [80]:
plot(x=1:10, y=2 .^rand(10), Scale.y_sqrt, Geom.point, Geom.smooth,
     Guide.xlabel("Stimulus"), Guide.ylabel("Response"), Guide.title("Dog Training"))
Out[80]:
[Figure: "Dog Training", Response (y axis, square-root scale) against Stimulus (x axis), points with a smoothed line]

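To adjust the scale of an axis or to work with titles and axis labels, use the following arguments:

  • Scale: controls the scale of the axes
  • Geom: controls how points, lines and other content are drawn
  • Guide: adds a title and axis labels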
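
A hedged sketch that combines one element of each kind in a single call, assuming a Gadfly.jl installation like the one used above. The data, labels and output file name are made up for illustration.

```julia
using Gadfly

xs = 1:10
ys = xs .^ 2

p = plot(x = xs, y = ys,
         Geom.point, Geom.line,        # Geom: draw points and connect them
         Scale.y_log10,                # Scale: logarithmic y axis
         Guide.xlabel("x"),            # Guide: axis labels and title
         Guide.ylabel("x squared"),
         Guide.title("Points and a line on a log scale"))

# Render to an SVG file in a temporary directory.
path = joinpath(mktempdir(), "sketch.svg")
draw(SVG(path, 4inch, 3inch), p)
```

The elements can be listed in any order after the data arguments.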

Save

Once the plot looks the way you want, you can save it in different file formats

myplot = plot(..)

draw(SVG("myplot.svg", 4inch, 3inch), myplot)
draw(PNG("myplot.png", 4inch, 3inch), myplot)
draw(PDF("myplot.pdf", 4inch, 3inch), myplot)
draw(PS("myplot.ps", 4inch, 3inch), myplot)
draw(D3("myplot.js", 4inch, 3inch), myplot)

Plotting data frames

When the data lives in a DataFrame, just give the column names

Note that the DataFrame goes first

In [81]:
plot(iris, x="SepalLength", y="SepalWidth", Geom.point)
Out[81]:
[Figure: scatter plot of SepalWidth (y axis) against SepalLength (x axis)]

Histogram

In [82]:
plot(dataset("car", "SLID"), x="Wages", color="Language", Geom.histogram)
Out[82]:
[Figure: histogram of Wages (x axis), colored by Language (English, Other, French)]

Functions and Expressions

When what you want to plot is a function, use the following syntax

plot(f::Function, a, b, ...)

a and b are the start and end positions of the plot along the X axis

In [83]:
plot(sin, 0, 25)
Out[83]:
x -30 -25 -20 -15 -10 -5 0 5 10 15 20 25 30 35 40 45 50 55 -25 -24 -23 -22 -21 -20 -19 -18 -17 -16 -15 -14 -13 -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 -25 0 25 50 -26 -24 -22 -20 -18 -16 -14 -12 -10 -8 -6 -4 -2 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 h,j,k,l,arrows,drag to pan i,o,+,-,scroll,shift-drag to zoom r,dbl-click to reset c for coordinates ? for help ? -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 -3.0 -2.9 -2.8 -2.7 -2.6 -2.5 -2.4 -2.3 -2.2 -2.1 -2.0 -1.9 -1.8 -1.7 -1.6 -1.5 -1.4 -1.3 -1.2 -1.1 -1.0 -0.9 -0.8 -0.7 -0.6 -0.5 -0.4 -0.3 -0.2 -0.1 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3.0 -4 -2 0 2 4 -3.0 -2.8 -2.6 -2.4 -2.2 -2.0 -1.8 -1.6 -1.4 -1.2 -1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.4 2.6 2.8 3.0 f(x)

The other form plots several functions at once:

plot(fs::Array, a, b, ...)
In [84]:
plot([sin, cos], 0, 25)
Out[84]:
[Figure: sin and cos plotted over x from 0 to 25; x-axis x, y-axis f(x), color legend f1 and f2]
In [85]:
plot((x) -> sin(x)/x, 0.001, 1000, Scale.x_log)
Out[85]:
[Figure: sin(x)/x plotted on a logarithmic x-axis; x-axis x, y-axis f(x)]
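
To make the role of a and b concrete, here is a rough sketch of what plotting a function amounts to: sample it on a grid between a and b, then draw the samples as a line. The grid size of 200 is an arbitrary choice for this illustration; Gadfly's own sampling may well differ.

```julia
using Gadfly

# Range to plot, measured along the x-axis (same values as plot(sin, 0, 25)).
a, b = 0, 25

# Sample the function by hand on an evenly spaced grid.
# 200 points is an assumed value for illustration only.
xs = collect(range(a, stop=b, length=200))
ys = sin.(xs)

# Draw the samples as a connected line with the same axis labels as above.
p = plot(x=xs, y=ys, Geom.line, Guide.xlabel("x"), Guide.ylabel("f(x)"))
```

Sampling by hand like this is mostly useful when you want to control the resolution yourself or reuse the sampled values elsewhere.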

Layers

When you need to draw several plots stacked on top of each other in the same figure, use layer.

In [86]:
plot(layer(x=rand(10), y=rand(10), Geom.point), layer(x=rand(10), y=rand(10), Geom.line))
Out[86]:
[Figure: random points (Geom.point) overlaid with a random line (Geom.line); x-axis x, y-axis y]
In [87]:
layer1 = layer(x=rand(10), y=rand(10), Geom.point)
layer2 = layer(x=rand(10), y=rand(10), Geom.line)
plot(layer1, layer2)  # When there are many layers to draw, I keep each one in its own variable
Out[87]:
[Figure: random points (Geom.point) overlaid with a random line (Geom.line); x-axis x, y-axis y]
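
A typical use of layers is overlaying raw observations on a reference curve. The sketch below assumes made-up noisy samples of sin; the variable names and the noise level of 0.1 are illustrative only.

```julia
using Gadfly

# Made-up data: noisy observations of sin on [0, 2pi].
xs = collect(range(0, stop=2pi, length=30))
ys = sin.(xs) .+ 0.1 .* randn(length(xs))

# One layer per visual element, each with its own geometry.
observations = layer(x=xs, y=ys, Geom.point)
reference = layer(x=xs, y=sin.(xs), Geom.line)

# Layers and guides can be passed to plot together.
p = plot(observations, reference, Guide.xlabel("x"), Guide.ylabel("y"))
```

Because each layer carries its own data and geometry, the two layers do not need to share the same x values.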

Geom

Next, a tour of commonly used plot types.

Histogram

In [88]:
plot(dataset("ggplot2", "diamonds"), x="Price", color="Cut", Geom.histogram)
Out[88]:
[Figure: histogram of Price, colored by Cut (Ideal, Premium, Good, Very Good, Fair)]

Probability density plot (kernel density estimation)

In [89]:
plot(dataset("ggplot2", "diamonds"), x="Price", color="Cut", Geom.density)
Out[89]:
[Figure: density curves of Price, colored by Cut (Ideal, Premium, Good, Very Good, Fair)]
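
For intuition about what a kernel density estimate computes, here is a minimal hand-rolled sketch using a Gaussian kernel. It is not Gadfly's implementation; the sample values and the bandwidth h are assumptions chosen only for illustration.

```julia
using Statistics

# Gaussian kernel: the standard normal density.
gaussian(u) = exp(-u^2 / 2) / sqrt(2pi)

# Density estimate at x: the average of kernels centered on each sample,
# scaled by the bandwidth h.
kde(x, samples, h) = mean(gaussian((x - s) / h) for s in samples) / h

samples = [1.0, 1.5, 2.0, 4.0, 4.2]   # made-up data
h = 0.5                                # bandwidth, picked by hand

# Evaluate the estimate on a grid wide enough to cover both tails.
grid = range(-2, stop=8, length=201)
density = [kde(x, samples, h) for x in grid]

# Riemann-sum check: a density should integrate to roughly 1.
area = sum(density) * step(grid)
```

A smaller h gives a bumpier curve that follows individual samples; a larger h gives a smoother one.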

2D histogram (x and y are separate dimensions; the count in each bin is shown by color)

In [90]:
plot(dataset("car", "Womenlf"), x="HIncome", y="Region", Geom.histogram2d)
Out[90]:
[Figure: 2D histogram with HIncome on the x-axis and Region (Ontario, Prairie, Atlantic, BC, Quebec) on the y-axis, colored by Count]
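
The counts that the color encodes can be sketched by hand: put each observation into an (x bin, y category) cell and count how many land in each cell. The data and the bin width of 10 below are made up for illustration and are not taken from the Womenlf dataset.

```julia
# Made-up observations: a numeric x value and a categorical y value.
income = [5, 12, 14, 23, 27, 8, 31, 16]
region = ["Ontario", "Quebec", "Ontario", "BC", "Quebec", "Ontario", "BC", "Quebec"]

binwidth = 10   # assumed bin width along x
counts = Dict{Tuple{Int,String},Int}()

for (x, y) in zip(income, region)
    xbin = fld(x, binwidth) * binwidth      # left edge of the x bin
    key = (xbin, y)
    counts[key] = get(counts, key, 0) + 1   # one more observation in this cell
end

# Each cell of the 2D histogram is colored by its count.
for (cell, n) in sort(collect(counts))
    println(cell, " => ", n)
end
```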

Hexbin plot

In [91]:
using Distributions
In [92]:
X = rand(MultivariateNormal([0.0, 0.0], [1.0 0.01; 0.01 1.0]), 10000);
plot(x=X[1,:], y=X[2,:], Geom.hexbin)
Out[92]:
[Figure: hexbin plot; x-axis x, y-axis y, colored by Count]

Contour plot (well suited to showing a function of two variables)

In [93]:
plot(z=(x,y) -> x*exp(-(x-round(Int, x))^2-y^2),
     x=range(-8, stop=8, step=0.01), y=range(-2, stop=2, step=0.01), Geom.contour)
Out[93]:
[Figure: contour plot; x-axis x, y-axis y, contour level shown by Color]
In [94]:
volcano = convert(Array{Float64}, dataset("datasets", "volcano"))
plot(z=volcano, Geom.contour)
Out[94]:
[Figure: contour plot of the volcano matrix; contour level shown by Color]
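
The two contour examples above are closely related: a function of two variables can be evaluated on a grid to produce a matrix of heights. The sketch below builds such a matrix by hand and passes it to Geom.contour together with the grid coordinates. The function f and the grid sizes are assumptions for illustration, and the row/column convention (rows follow x, columns follow y) is my reading of the Gadfly documentation, so it is worth verifying on your version.

```julia
using Gadfly

# Hypothetical function of two variables, used only for this sketch.
f(x, y) = x * exp(-x^2 - y^2)

# Grid coordinates along each axis.
xs = collect(range(-2, stop=2, length=50))
ys = collect(range(-1, stop=1, length=40))

# Height matrix: Z[i, j] = f(xs[i], ys[j]), so rows follow x and columns follow y.
Z = [f(x, y) for x in xs, y in ys]

p = plot(z=Z, x=xs, y=ys, Geom.contour)
```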

Box plot

In [95]:
plot(dataset("lattice", "singer"), x="VoicePart", y="Height", Geom.boxplot)
Out[95]:
[Figure: box plot of Height for each VoicePart (Soprano 1, Soprano 2, Alto 1, Alto 2, Tenor 1, Tenor 2, Bass 1, Bass 2)]
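
The numbers a box plot summarizes can be computed directly with the Statistics standard library. This sketch uses made-up heights and the common Tukey convention of fences at 1.5 times the interquartile range; treat that convention as an assumption rather than a statement about Gadfly's internals.

```julia
using Statistics

heights = [60, 62, 63, 64, 65, 65, 66, 67, 68, 75]   # made-up data

# The box spans the first to third quartile; the line inside is the median.
q1, med, q3 = quantile(heights, [0.25, 0.5, 0.75])
box_width = q3 - q1   # interquartile range

# Points beyond the fences are drawn individually as outliers.
lo_fence = q1 - 1.5 * box_width
hi_fence = q3 + 1.5 * box_width
outliers = filter(h -> h < lo_fence || h > hi_fence, heights)
```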

Beeswarm plot

In [96]:
plot(dataset("lattice", "singer"), x="VoicePart", y="Height", Geom.beeswarm)
Out[96]:
[Figure: beeswarm plot of Height for each VoicePart (Soprano 1, Soprano 2, Alto 1, Alto 2, Tenor 1, Tenor 2, Bass 1, Bass 2)]

Violin plot

In [97]:
plot(dataset("lattice", "singer"), x="VoicePart", y="Height", Geom.violin)
Out[97]:
[Figure: violin plot of Height for each VoicePart (Soprano 1, Soprano 2, Alto 1, Alto 2, Tenor 1, Tenor 2, Bass 1, Bass 2)]

Heatmap

In [98]:
plot(dataset("Zelig", "macro"), x="Year", y="Country", color="GDP", Geom.rectbin)
Out[98]:
[Figure: heatmap with Year on the x-axis and Country on the y-axis, colored by GDP]

Q&A

Other packages:

  • JSON.jl
  • LightXML.jl
  • HDF5.jl
  • JLD2.jl
  • SQLite.jl
  • MySQL.jl